<!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements.  See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
-->
<html>

<head>
<meta http-equiv="Content-Language" content="en-us">
<link rel="stylesheet" type="text/css" href="../stylesheets/style.css">
<title>XMLCatalog Type</title>
</head>

<body>

<h2 id="XMLCatalog">XMLCatalog</h2>

<p>An XMLCatalog is a catalog of public resources such as DTDs or entities that are referenced in an
XML document.  Catalogs are typically used to make web references to resources point to a locally
cached copy of the resource.</p>

<p>This allows the XML Parser, XSLT Processor or other consumer of XML documents to efficiently
allow a local substitution for a resource available on the web.</p>
<p><strong>Note:</strong> This task <em>uses, but does not depend on</em> external libraries not
included in the Apache Ant distribution.  See <a href="../install.html#librarydependencies">Library
Dependencies</a> for more information.</p>
<p>This data type provides a catalog of resource locations based on
the <a href="https://www.oasis-open.org/committees/download.php/14809/xml-catalogs.html"
target="_top">OASIS XML Catalog standard</a>.  The catalog entries are used both for Entity
resolution and URI resolution, in accordance with
the <code class="code">org.xml.sax.EntityResolver</code>
and <code class="code">javax.xml.transform.URIResolver</code> interfaces as defined in
the <a href="https://download.oracle.com/otn-pub/jcp/jaxp-1_6-mrel3-spec/JAXP1_6-FinalSpec.pdf"
target="_top">Java API for XML Processing (JAXP) Specification</a>.</p>
<p>For example, in a <code>web.xml</code> file, the DTD is referenced as:</p>
<pre>
&lt;!DOCTYPE web-app PUBLIC &quot;-//Sun Microsystems, Inc.//DTD Web Application 2.2//EN&quot;
  &quot;http://java.sun.com/j2ee/dtds/web-app_2_2.dtd&quot;&gt;</pre>
<p>The XML processor, without XMLCatalog support, would need to retrieve the DTD from the URL
specified whenever validation of the document was required.</p>
<p>This can be very time consuming during the build process, especially where network throughput
is limited.  Alternatively, you can do the following:</p>
<ol>
<li>Copy <samp>web-app_2_2.dtd</samp> onto your local disk somewhere (either in the filesystem
or even embedded inside a jar or zip file on the classpath).</li>
<li>Create an <code>&lt;xmlcatalog&gt;</code> with a <code>&lt;dtd&gt;</code> element
whose <var>location</var> attribute points to the file.</li>
<li>Success! The XML processor will now use the local copy instead of calling out to the
internet.</li>
</ol>
<p>XMLCatalogs can appear inside tasks that support this feature or at the same level
as <code>target</code>&mdash;i.e., as children of <code>project</code> for reuse across
different tasks, e.g. XML Validation and XSLT Transformation.  The XML Validate task uses
XMLCatalogs for entity resolution.  The XSLT Transformation task uses XMLCatalogs for both
entity and URI resolution.</p>
<p>XMLCatalogs are specified as either a reference to another XMLCatalog, defined previously in
a build file, or as a list of <code>dtd</code> or <code>entity</code> locations.  In addition,
external catalog files may be specified in a nested <code>catalogpath</code>, but they will be
ignored unless the resolver library from xml-commons is available in the system
classpath.  <strong>Due to backwards incompatible changes in the resolver code after the release
of resolver 1.0, Ant supports only resolver 1.1 or later.</strong>  A separate classpath for
entity resolution may be specified inline via nested <code>classpath</code> elements; otherwise
the system classpath is used for this as well.</p>
<p>XMLCatalogs can also be nested inside other XMLCatalogs.  For example, a "superset"
XMLCatalog could be made by including several nested XMLCatalogs that referred to other,
previously defined XMLCatalogs.</p>
<p>Resource locations can be specified either in-line or in external catalog file(s), or both.  In
order to use an external catalog file, the xml-commons resolver library (<samp>resolver.jar</samp>)
must be in your path.  External catalog files may be
either <a href="https://oasis-open.org/committees/entity/background/9401.html" target="_top">plain
text format</a> or <a href="https://www.oasis-open.org/committees/entity/spec-2001-08-06.html"
target="_top">XML format</a>.  If the xml-commons resolver library is not found in the classpath,
external catalog files, specified in <code>catalogpath</code>, will be ignored and a warning will be
logged.  In this case, however, processing of inline entries will proceed normally.</p>
<p>Currently, only <code>&lt;dtd&gt;</code> and <code>&lt;entity&gt;</code> elements may be
specified inline; these roughly correspond to OASIS catalog entry types <code>PUBLIC</code>
and <code>URI</code> respectively.  By contrast, external catalog files may use any of the entry
types defined in
the <a href="https://www.oasis-open.org/committees/download.php/14809/xml-catalogs.html"
target="_top">OASIS specification</a>.</p>
<h3 id="ResolverAlgorithm">Entity/DTD/URI resolution algorithm</h3>
<p>When an entity, DTD, or URI is looked up by the XML processor, the XMLCatalog searches its list
of entries to see if any match.  That is, it attempts to match the <var>publicId</var> attribute of
each entry with the PublicID or URI of the entity to be resolved.  Assuming a matching entry is
found, XMLCatalog then executes the following steps:</p>

<h4>1. Filesystem lookup</h4>

<p>The <var>location</var> is first looked up in the filesystem.  If the <var>location</var> is a
relative path, the Ant project <var>basedir</var> attribute is used as the base directory.  If
the <var>location</var> specifies an absolute path, it is used as is.  Once we have an absolute path
in hand, we check to see if a valid and readable file exists at that path.  If so, we are done.  If
not, we proceed to the next step.</p>

<h4>2. Classpath lookup</h4>

<p>The <var>location</var> is next looked up in the classpath.  Recall that jar files are merely
fancy zip files.  For classpath lookup, the <var>location</var> is used as is (no base is
prepended).  We use a Classloader to attempt to load the resource from the classpath.  For example,
if <samp>hello.jar</samp> is in the classpath and it contains <samp>foo/bar/blat.dtd</samp> it will
resolve an entity whose <var>location</var> is <samp>foo/bar/blat.dtd</samp>.  Of course, it
will <em>not</em> resolve an entity whose <var>location</var> is <code>blat.dtd</code>.</p>

<h4>3a. Apache xml-commons resolver lookup</h4>

<p>What happens next depends on whether the resolver library from xml-commons is available on the
classpath.  If so, we defer all further attempts at resolving to it.  The resolver library supports
extremely sophisticated functionality like URL rewriting and so on, which can be accessed by making
the appropriate entries in external catalog files (XMLCatalog does not yet provide inline support
for all of the entries defined in
the <a href="https://www.oasis-open.org/committees/download.php/14809/xml-catalogs.html"
target="_top">OASIS standard</a>).</p>

<h4>3. URL-space lookup</h4>

<p>Finally, we attempt to make a URL out of the <var>location</var>.  At first this may seem like
this would defeat the purpose of XMLCatalogs&mdash;why go back out to the internet?  But in fact,
this can be used to (in a sense) implement HTTP redirects, substituting one URL for another.  The
mapped-to URL might also be served by a local web server.  If the URL resolves to a valid and
readable resource, we are done.  Otherwise, we give up.  In this case, the XML processor will
perform its normal resolution algorithm.  Depending on the processor configuration, further
resolution failures may or may not result in fatal (i.e. build-ending) errors.</p>

<h3>XMLCatalog attributes</h3>
<table class="attr">
  <tr>
    <th scope="col">Attribute</th>
    <th scope="col">Description</th>
    <th scope="col">Required</th>
  </tr>
  <tr>
    <td>id</td>
    <td>a unique name for an XMLCatalog, used for referencing the XMLCatalog's contents from
    another XMLCatalog</td>
    <td>No</td>
  </tr>
  <tr>
    <td>refid</td>
    <td>the <var>id</var> of another XMLCatalog whose contents you would like to be used for
    this XMLCatalog</td>
    <td>No</td>
  </tr>
</table>

<h3>XMLCatalog nested elements</h3>
<h4>dtd/entity</h4>
<p>The <code>dtd</code> and <code>entity</code> elements used to specify XMLCatalogs are
identical in their structure</p>
<table class="attr">
  <tr>
    <th scope="col">Attribute</th>
    <th scope="col">Description</th>
    <th scope="col">Required</th>
  </tr>
  <tr>
    <td>publicId</td>
    <td>The public identifier used when defining a dtd or entity, e.g.
    <code>&quot;-//Sun Microsystems, Inc.//DTD Web Application 2.2//EN&quot;</code></td>
    <td>Yes</td>
  </tr>
  <tr>
    <td>location</td>
    <td>The location of the local replacement to be used for the public identifier specified. This
    may be specified as a file name, resource name found on the classpath, or a URL.  Relative paths
    will be resolved according to the base, which by default is the Ant
    project <var>basedir</var>.</td>
    <td>Yes</td>
  </tr>
</table>
<h4>classpath</h4>
<p>The classpath to use for <a href="#ResolverAlgorithm">entity resolution</a>.  The
nested <code>&lt;classpath&gt;</code> is a <a href="../using.html#path">path</a>-like structure.</p>
<h4>catalogpath</h4>
<p>The nested <code>catalogpath</code> element is a <a href="../using.html#path">path</a>-like
structure listing catalog files to search.  All files in this path are assumed to be OASIS catalog
files, in either <a href="https://oasis-open.org/committees/entity/background/9401.html"
target="_top">plain text format</a>
or <a href="https://www.oasis-open.org/committees/entity/spec-2001-08-06.html" target="_top">XML
format</a>. Entries specifying nonexistent files will be ignored. If the resolver library from
xml-commons is not available in the classpath, all <code>catalogpath</code>s will be ignored and a
warning will be logged.</p>
<h3>Examples</h3>
<p>Set up an XMLCatalog with a single DTD referenced locally in a user's home directory:</p>
<pre>
&lt;xmlcatalog&gt;
    &lt;dtd publicId=&quot;-//OASIS//DTD DocBook XML V4.1.2//EN&quot;
         location=&quot;/home/dion/downloads/docbook/docbookx.dtd&quot;/&gt;
&lt;/xmlcatalog&gt;</pre>
<p>Set up an XMLCatalog with a multiple DTDs to be found either in the filesystem (relative to
the Ant project <var>basedir</var>) or in the classpath:
</p>
<pre>
&lt;xmlcatalog id=&quot;commonDTDs&quot;&gt;
    &lt;dtd publicId=&quot;-//OASIS//DTD DocBook XML V4.1.2//EN&quot;
         location=&quot;docbook/docbookx.dtd&quot;/&gt;
    &lt;dtd publicId=&quot;-//Sun Microsystems, Inc.//DTD Web Application 2.2//EN&quot;
         location=&quot;web-app_2_2.dtd&quot;/&gt;
&lt;/xmlcatalog&gt;</pre>

<p>Set up an XMLCatalog with a combination of DTDs and entities as well as a nested XMLCatalog
and external catalog files in both formats:</p>

<pre>
&lt;xmlcatalog id=&quot;allcatalogs&quot;&gt;
    &lt;dtd publicId=&quot;-//ArielPartners//DTD XML Article V1.0//EN&quot;
         location=&quot;com/arielpartners/knowledgebase/dtd/article.dtd&quot;/&gt;
    &lt;entity publicId=&quot;LargeLogo&quot;
            location=&quot;com/arielpartners/images/ariel-logo-large.gif&quot;/&gt;
    &lt;xmlcatalog refid="commonDTDs"/&gt;
        &lt;catalogpath&gt;
            &lt;pathelement location="/etc/sgml/catalog"/&gt;
            &lt;fileset dir=&quot;/anetwork/drive&quot;
                     includes=&quot;**/catalog&quot;/&gt;
            &lt;fileset dir=&quot;/my/catalogs&quot;
                     includes=&quot;**/catalog.xml&quot;/&gt;
        &lt;/catalogpath&gt;
    &lt;/xmlcatalog&gt;
&lt;/xmlcatalog&gt;</pre>
<p>To reference the above XMLCatalog in an <code>xslt</code> task:<p>
<pre>
&lt;xslt basedir="${source.doc}"
      destdir="${dest.xdocs}"
      extension=".xml"
      style="${source.xsl.converter.docbook}"
      includes="**/*.xml"
      force="true"&gt;
    &lt;xmlcatalog refid=&quot;allcatalogs&quot;/&gt;
&lt;/xslt&gt;</pre>

</body>
</html>
